Abstract

Low-molecular-weight glutenin subunits (LMW-GS), encoded by a complex multigene family, play an important role 
in the processing quality of wheat flour. Although members of this gene family have been identified in several wheat 
varieties, the allelic variation and composition of LMW-GS genes in common wheat are not well understood. In the 
present study, using the LMW-GS gene molecular marker system and the full-length gene cloning method, a com- 
prehensive molecular analysis of LMW-GS genes was conducted in a representative population, the micro-core col- 
lections (MCC) of Chinese wheat germplasm. Generally, >15 LMW-GS genes were identified from individual MCC 
accessions, of which 4-6 were located at the Glu-A3 locus, 3-5 at the Glu-B3 locus, and eight at the Glu-D3 locus. 
LMW-GS genes at the Glu-A3 locus showed the highest allelic diversity, followed by the Glu-B3 genes, while the 
Glu-D3 genes were extremely conserved among MCC accessions. Expression and sequence analysis showed that 
9-13 active LMW-GS genes were present in each accession. Sequence identity analysis showed that all i-type genes 
present at the Glu-A3 locus formed a single group, the s-type genes located at Glu-B3 and Glu-D3 loci comprised a 
unique group, while high-diversity m-type genes were classified into four groups and detected in all Glu-3 loci. These 
results contribute to the functional analysis of LMW-GS genes and facilitate improvement of bread-making quality by 
wheat molecular breeding programmes. 

Keywords: Allele, common wheat, composition, low-molecular-weight glutenin subunits. 



Introduction 

Common wheat (Triticum aestivum L.) is one of the 'big three' cereal crops used for
human food (Shewry, 2009), since wheat grains confer viscoelastic properties on wheat
dough (Shewry et al., 1995). These viscoelastic properties, which are determined by
the glutenin and gliadin proteins in wheat seeds, allow dough to be processed into a
wide range of daily food products (Shewry et al., 1995; D'Ovidio and Masci, 2004;
Juhasz and Gianibelli, 2006). Glutenin proteins are composed of two groups of
subunits, namely high-molecular-weight and low-molecular-weight glutenin subunits
(HMW-GS, 65-90 kDa; LMW-GS, 30-45 kDa) (Payne, 1987; D'Ovidio and Masci, 2004).
The LMW-GS account for about one-third of the seed protein and 60% of glutenin
proteins, and play an important role in determining dough properties and the quality
of wheat food products (Gupta et al., 1991, 1994; Branlard et al., 2001; Eagles et al.,
2002; Howitt et al., 2006). Thus, elucidating the composition and variation of LMW-GS
genes in common wheat and investigating the relationship between allelic variants and
end-use quality are of interest for wheat quality improvement (Gupta et al., 1994;
He et al., 2005; Bekes et al., 2006; Juhasz and Gianibelli, 2006; Liu et al., 2010;
Zhang et al., 2012).

LMW-GS genes form a multigene family in common wheat, generally located at the
Glu-A3, Glu-B3, and Glu-D3 loci on the short arms of homoeologous group 1 chromosomes
(Jackson et al., 1983). The copy number of LMW-GS genes was estimated to range from
10-20 to 30-40 (Ikeda et al., 2002; D'Ovidio and Masci, 2004; Juhasz and Gianibelli,
2006; Huang and Cloutier, 2008; Dong et al., 2010; Zhang et al., 2011a, b). In Norin
61, 12 groups of LMW-GS genes were identified by screening the cDNA library (Ikeda
et al., 2006). In Glenlea, among the 12 active genes, one was assigned to chromosome
1A, two to chromosome 1B, and nine to chromosome 1D (Huang and Cloutier, 2008). In
Xiaoyan 54, 14 unique LMW-GS genes were identified using BAC (bacterial artificial
chromosome) library screening and proteomics analysis, of which four were located at
Glu-A3, three at Glu-B3, and seven at Glu-D3. Of the 11 active genes, two, two, and
seven were i-, s-, and m-type genes, respectively (Dong et al., 2010). The above three
varieties contained different LMW-GS gene compositions, suggesting that this gene
family has high molecular diversity among wheat varieties (Dong et al., 2010).
Moreover, these LMW-GS proteins had similar physical and chemical properties and
molecular weights to the gliadins, which are a type of alcohol-soluble, monomeric
seed storage protein. The high copy number and their co-migration with gliadins by
SDS-PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis) and MALDI-TOF
MS (matrix-assisted laser desorption ionization-time of flight mass spectrometry)
made separation of LMW-GS proteins and isolation of all LMW-GS genes from a
particular wheat variety difficult (Howitt et al., 2006). Thus, characterization of
the allelic variation of LMW-GS genes in wheat germplasm remains challenging.

To dissect the LMW-GS complex, a nomenclature system was developed based on their
relative mobility in SDS-PAGE (Singh et al., 1991). Recently, their encoding genes
were isolated using PCR with gene-specific primers. One m- and two i-type LMW-GS
genes (GluA3-1, GluA3-2, and GluA3-3) were isolated from each Glu-A3 allele (Wang
et al., 2010). At the Glu-B3 locus, four LMW-GS genes (GluB3-1, GluB3-2, GluB3-3,
and GluB3-4) and their allelic variants were isolated from nine Glu-B3 alleles
(Glu-B3a to Glu-B3i) (Wang et al., 2009). Also, six Glu-D3 genes were identified
from individual wheat varieties containing Glu-D3a to Glu-D3e (Zhao et al., 2006,
2007). Meanwhile, allele-specific markers were developed to discriminate LMW-GS
genes and their allelic variants in common wheat (Zhao et al., 2007; Appelbee
et al., 2009; Wang et al., 2009, 2010). These markers facilitated identification of
the known Glu-A3 and Glu-B3 alleles used in breeding programmes (Liu et al., 2010).
However, Glu-D3 genes were highly conserved, and allelic identification using PCR
markers was difficult (Liu et al., 2010). Moreover, these allele-specific primers
characterized only one or two genes in individual alleles, and could not be used to
determine the exact composition of LMW-GS genes in individual varieties.

To determine the composition of LMW-GS genes in individual wheat varieties, the
LMW-GS gene molecular marker system and the full-length gene-cloning method were
developed (Zhang et al., 2011a, b), which enabled identification and
characterization of the complete sequences of all LMW-GS genes in any wheat variety.
In the present study, using both methods, LMW-GS genes were investigated in the
micro-core collections (MCC) of Chinese wheat germplasm, which covers >70% of the
genetic diversity of Chinese wheat germplasm (Hao et al., 2011). The composition,
organization, allelic variation, and expression of LMW-GS genes in 262 MCC
accessions were comprehensively investigated.



Materials and methods 

Wheat germplasm 

The MCC of Chinese wheat germplasm were obtained from the 
Institute of Crop Science, Chinese Academy of Agricultural 
Sciences (CAAS). This was a representative sample of Chinese 
wheat diversity. This collection consisted of 262 accessions includ- 
ing 88 modern varieties, 157 landraces, and 17 foreign varieties, 
which accounted for >70% of the genetic diversity of the national 
collection. 

LMW-GS gene analysis 

Genomic DNAs of 262 MCC accessions were extracted from young leaves of seedlings
with the cetyltrimethyl ammonium bromide (CTAB) method following Saghai-Maroof
et al. (1984). The LMW-GS genes in all the MCC accessions were separated using the
LMW-GS gene molecular marker system (Zhang et al., 2011b). Based on data from the
marker system, 45 accessions containing almost all allelic variants were selected
for RNA analysis. Total RNA was prepared from developing seeds at 15-21 dpa (days
post-anthesis) using TRIzol® Reagent, according to the manufacturer's protocol
(Invitrogen, Carlsbad, CA, USA). The RNA was converted into cDNA using Moloney
murine leukemia virus (M-MLV) reverse transcriptase (Promega, Madison, WI, USA).
The expressed LMW-GS genes were detected using the molecular marker system (Zhang
et al., 2011b). To obtain the full-length sequence of these LMW-GS genes, 30
representative varieties containing all the main allelic variants of each gene were
selected. All genes were cloned and identified using the full-length gene cloning
method (Zhang et al., 2011a). To clone rare allelic variants, gene-specific primers
were developed (Supplementary Table S1 available at JXB online). Sequence analysis
and characterization were performed using Lasergene software (DNAStar;
http://www.dnastar.com/), ClustalW2 (http://www.ebi.ac.uk/Tools/msa/clustalw2/), and
MEGA 5 software (Kumar et al., 2008).

Results 

Composition and variation of LMW-GS genes in MCC 
accessions 

LMW-GS genes in the MCC of Chinese wheat germplasm were amplified using the LMW-GS
gene molecular marker system consisting of three independent sets of conserved
primers (Zhang et al., 2011b). In general terms, at least 15 LMW-GS genes were
identified in each wheat variety (Table 1). To characterize these LMW-GS genes
further, 30 representative accessions containing the main allelic variants were
selected, and all of their LMW-GS genes were cloned and sequenced using the
full-length gene cloning method and gene-specific primers (Table 1; Supplementary
Table S1 at JXB online) (Zhang et al., 2011a). In total, 466 LMW-GS gene sequences
were identified and deposited in GenBank (JX877778-JX878243).

For each LMW-GS gene, several allelic variants were detected from the MCC (Table 1),
some of which were identified previously (Ikeda et al., 2006; Zhao et al., 2006,
2007; Huang and Cloutier, 2008; Wang et al., 2009, 2010; Dong et al., 2010; Zhang
et al., 2011a, b). Using available mapped
LMW-GS genes and their allelic relationship with the genes 
identified from the MCC, all these genes were assigned to 
specific wheat chromosomes. In individual accessions, 4-6 
genes were located at the Glu-A3 locus, 3-5 at the Glu-B3 
locus, and eight at the Glu-D3 locus (Table 1). These genes 
were named according to their DNA fragment size and chro- 
mosomal location. For example, the gene corresponding to 
DNA fragment 441.5, located at the Glu-D3 locus, was des- 
ignated D3-441. Most of these genes contained several allelic 
variations across the MCC. To simplify description, the 
major allelic variant was selected to represent each LMW-GS 
gene and it was named according to the following scheme: 
'DNA fragment+gene' (e.g. the A3-402 gene), while the 
allelic variants of each gene were named according to 'DNA 
fragment+allele' (e.g. the A3-402 allele). 
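
The naming convention described above can be sketched in a few lines of Python. This is only an illustration: the function name `gene_label` is hypothetical, and truncating the fragment size to a whole number (so that 441.5 gives 441, as in the D3-441 example) is an assumption inferred from that single example rather than a documented rule of the marker system.

```python
def gene_label(locus: str, fragment_size: float) -> str:
    """Build a label such as 'D3-441' from a Glu-3 locus name and a fragment size.

    Assumption: the fragment size is truncated to an integer, which
    reproduces the example in the text (fragment 441.5 at Glu-D3 -> D3-441).
    """
    # Drop the 'Glu-' prefix if present: 'Glu-D3' -> 'D3'.
    prefix = locus[4:] if locus.startswith("Glu-") else locus
    return f"{prefix}-{int(fragment_size)}"


label = gene_label("Glu-D3", 441.5)
print(label)  # D3-441
```

The same label is used for a gene and for its major allelic variant, so whether 'A3-402' refers to the gene or to the allele has to be read from context.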

LMW-GS genes at the Glu-A3 locus

At the Glu-A3 locus, 4-6 LMW-GS genes were detected in each accession and several
allelic variants were identified for each gene in the MCC (Table 1; Fig. 1). With
regard to the A3-391 gene, five allelic variants, A3-353, A3-370a, A3-370b, A3-373,
and A3-391, shared >97% identity (Supplementary Fig. S1 at JXB online). The A3-391
allele predominated in 196 accessions, while A3-353 and A3-373 were rare variants,
each present in only two MCC accessions (Fig. 1a). Sequence analysis of the A3-391
genes of 30 varieties confirmed that allelic variants showed length polymorphisms
in the repetitive regions, and that each variant contained its own single nucleotide
polymorphisms (SNPs) (Supplementary Fig. S1). A3-353, A3-373, and A3-391 were highly
conserved across the MCC population, whereas the A3-370 allele could be further
divided into two variants (A3-370a and A3-370b) due to SNPs in all available
sequences (Supplementary Fig. S1). The A3-353, A3-370a, A3-370b, and A3-391 alleles
contained premature stop codons, and only the rare allele A3-373 possessed an
intact open reading frame (ORF) encoding an m-type subunit (Supplementary Table S2).
Thus, the A3-391 gene was universal in common wheat, even though only five sequences
with >98% identities were deposited in GenBank.
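
As a rough illustration of how an identity figure such as '>97%' can be obtained from an alignment, the sketch below computes pairwise percent identity for two pre-aligned sequences. The function name and the toy sequences are hypothetical, and the scoring convention (gap columns count as mismatches, full alignment length as denominator) is an assumption; Lasergene and ClustalW2, which were used in this study, may define identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base.

    Assumption: gap columns ('-') count as mismatches and the full
    alignment length is the denominator.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: the second sequence lacks one CAA repeat unit.
allele_1 = "ATGCAACAACAG"
allele_2 = "ATGCAA---CAG"
identity = percent_identity(allele_1, allele_2)
print(round(identity, 1))  # 75.0
```

On full-length genes of several hundred base pairs, a single CAA indel or a few SNPs changes the value far less than in this 12-column toy case, which is why closely related variants can still share very high identity.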

The A3-400 gene was also common in wheat varieties. Seven allelic variants with
different repetitive region lengths, A3-374, A3-388, A3-394, A3-400, A3-402, A3-408,
and A3-411, were identified from MCC accessions (Fig. 1a). Sequence alignments
suggested that the A3-374, A3-388, A3-400, A3-408, and A3-411 alleles were conserved
among wheat varieties, whereas both A3-394 and A3-402 comprised two variants,
A3-394a and A3-394b, and A3-402a and A3-402b, respectively, due to indels and SNPs
in the available sequences (Supplementary Fig. S2 at JXB online). Sequence analysis
also demonstrated that A3-402a and A3-400 shared the same SNP, which introduced a
premature stop codon, while all remaining allelic variants (i.e. A3-374, A3-388,
A3-394a, A3-394b, A3-402b, A3-408, and A3-411) contained intact coding sequences
that might encode m-type LMW-GS in common wheat (Supplementary Fig. S2).

Except for the m-type genes above, all the others identified at the Glu-A3 locus
were i-type genes (Fig. 1a; Supplementary Table S2 at JXB online). The coding
sequences of alleles A3-480, A3-484, A3-487, A3-502, and A3-508 contained a specific
length of repetitive regions and unique SNPs. For the major variant A3-502, eight
allelic variants (A3-502a to A3-502h) were recognized due to SNPs and indels in the
coding sequences (Fig. 1a; Supplementary Fig. S3). For the other i-type genes, eight
genes/haplotypes, A3-620, A3-626, A3-643, A3-646, A3-649, A3-573/A3-640,
A3-567/A3-590, and A3-565/A3-568/A3-662, were detected, of which alleles A3-626 and
A3-646 were each further divided into two allelic variants due to SNPs in the
coding sequences. Moreover, two genes, A3-649-1 and A3-649-2, which had unique SNPs
and indels but shared a 649 bp DNA fragment, were identified from individual
accessions; the A3-567-1 and A3-567-2 genes showed a similar pattern. After
analysing the composition of i-type genes in 262 MCC accessions, it was found that
all A3-502 variants were tightly linked with other unique i-type genes. The A3-502a
and A3-502b alleles were coupled with A3-620, A3-502c with A3-626a, A3-502d with
A3-643, A3-502e with A3-646a, A3-502f with A3-646b, A3-502g with A3-573 and A3-640,
and A3-502h with A3-649-1 and A3-649-2. In total, 12 i-type haplotypes were
identified in MCC accessions (Fig. 1a).
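
The couplings between A3-502 variants and their linked i-type genes listed above can be written down as a simple lookup table. The sketch below is illustrative only: the dictionary and function names are hypothetical, and it covers just the eight A3-502 variants named in the text, not all 12 i-type haplotypes.

```python
# A3-502 variants and the i-type genes they are coupled with, as listed in the text.
A3_502_PARTNERS = {
    "A3-502a": ("A3-620",),
    "A3-502b": ("A3-620",),
    "A3-502c": ("A3-626a",),
    "A3-502d": ("A3-643",),
    "A3-502e": ("A3-646a",),
    "A3-502f": ("A3-646b",),
    "A3-502g": ("A3-573", "A3-640"),
    "A3-502h": ("A3-649-1", "A3-649-2"),
}


def haplotype(variant: str) -> str:
    """Return a slash-separated haplotype label for a given A3-502 variant."""
    return "/".join((variant,) + A3_502_PARTNERS[variant])


print(haplotype("A3-502g"))  # A3-502g/A3-573/A3-640
```

Because the genes are tightly linked, identifying the A3-502 variant in an accession is, under this table, enough to predict which other i-type genes accompany it.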

LMW-GS genes at the Glu-B3 locus

Generally, excluding 11 1BL/1RS translocation lines, 3-5 Glu-B3 genes were
identified in each MCC accession, and the B3-530 and B3-548 genes were universal
(Fig. 1b). With regard to the B3-530 gene, three allelic variants, B3-510, B3-516,
and B3-530, were identified (Table 1; Fig. 1b), and shared high sequence identity
(>99%). The B3-530 allele was further divided into three variants (B3-530a, B3-530b,
and B3-530c) due to unique SNPs (Supplementary Fig. S4 at JXB online). All allelic
variants of the B3-530 gene possessed intact ORFs and their deduced proteins
belonged to m-type LMW-GS. For the B3-548 gene, the B3-548 allele predominated in
241 accessions, while the rare allelic variant B3-557 was detected in only one
(Table 1; Fig. 1b). However, both variants contained premature stop codons in their
coding sequences, suggesting that they are pseudogenes.

The other Glu-B3 genes were identified in only some MCC accessions (Fig. 1). The
B3-570 gene was identified in 44 MCC accessions; its intact ORF encoded 344 amino
acid residues, corresponding to an m-type subunit with the novel N-terminal
sequence METSQIPGLEKPS. The B3-578, B3-621, and B3-544 genes were tightly linked at
the Glu-B3 locus and formed several haplotypes (Fig. 1b). The B3-578 gene was
classified into two variants (B3-578a and B3-578b) based on two SNPs. The B3-544
genes were conserved among accessions, and the allelic variants B3-544 and
B3-587-B3-607 shared >99% sequence identity (Supplementary Fig. S5 at JXB online).
Also, B3-621 and B3-624 differed in only a CAA indel and an SNP (Supplementary
Fig. S6). Moreover, B3-578 contained two premature stop codons in its coding
sequence, while the other genes possessed intact ORFs and might be active in common
wheat. Sequence analysis showed that all three genes were s-type, the deduced
protein sequences of which contained a MENSHIPGLERPS peptide at the N-terminus.
B3-688 could be divided into three groups (B3-688a, B3-688b, and B3-688c) in MCC
accessions, although several irregular SNPs were found in the coding sequences
(Supplementary Fig. S6). These B3-688 variants were tightly linked with B3-691 or
B3-813 at the Glu-B3 locus, forming different haplotypes, namely B3-688a/B3-691,
B3-688b/B3-813, and B3-688c/N (Fig. 1b). B3-688a was coupled with B3-691, with
which it shared >99% identity, the only differences being a CAA indel and an SNP.
B3-813 was identified for the first time in common wheat and contained a premature
stop codon in the ORF, while both B3-688 and B3-691 had unbroken ORFs encoding
s-type LMW-GS in common wheat.

LMW-GS genes at the Glu-D3 locus

Eight LMW-GS genes (D3-385, D3-393, D3-394, D3-441, D3-525, D3-575, D3-578, and
D3-586) were detected at the Glu-D3 locus in individual wheat varieties (Table 1;
Fig. 1c). Neither D3-385 nor D3-575 had any allelic variants, and both were
universal in all MCC accessions. In terms of the D3-393 gene, the D3-393 allele was
present in 259 MCC accessions, while the other rare allelic variant, D3-385', was
found in only three (Fengkang 2, Guinong 10, and Lovrin 10). For the D3-394 gene,
both allelic variants, D3-394 and D3-397, shared >99% sequence identity, the
difference being an indel (CAA) and two SNPs; the latter allele was detected in
only a single landrace, Yizhimai. Three conserved allelic variants of D3-441
(D3-432, D3-441, and D3-444) were identified with CAA indels in the repetitive
region (Supplementary Fig. S7 at JXB online). The three allelic variants of the
D3-525 gene, namely D3-522, D3-525, and D3-528, exhibited a similar phenomenon
(Supplementary Fig. S8). For the D3-578 gene, D3-578a was identical in length to
the D3-578b variant but had a different nucleotide sequence (Supplementary Fig. S9).
This gene encoded the only s-type LMW-GS at the Glu-D3 locus in common wheat. The
nucleotide sequences of D3-586 and its allelic variants were identical, except for
the CAA indels in the repetitive region that contributed to the length polymorphism
(Supplementary Fig. S10). Among the eight LMW-GS genes at the Glu-D3 locus, D3-393
was a pseudogene due to a frameshift mutation and D3-586 was a pseudogene due to a
premature stop codon. The remaining six genes contained intact ORFs that encoded
one s-type (D3-578) and five m-type LMW-GS at the Glu-D3 locus, which contained the
largest number of active genes of all the Glu-3 loci.

Fig. 1. Composition of LMW-GS genes at the Glu-3 loci in micro-core collections
(MCC) of Chinese wheat germplasm. The diagrams illustrate the LMW-GS genes and their
allelic variants at the Glu-A3, Glu-B3, and Glu-D3 loci identified from MCC
accessions. The horizontal axis of each diagram shows the allelic variants of
individual genes or haplotypes identified from the MCC. The vertical axis displays
the composition of unique genes and haplotypes in individual accessions. The length
of line segments represents the number of accessions containing the corresponding
allelic variants. Underlined allelic variants were rare in the MCC, and allelic
variants in red were active in common wheat. The allelic variants indicated by
numbers and letters, such as 370a and 370b, shared the same DNA fragment but had
different nucleotide sequences. (A) The compositions of genes and their allelic
variants at the Glu-A3 locus. The i-type genes showed high diversity and were
tightly linked, forming specific haplotypes. (B) The compositions of genes and their
allelic variants at the Glu-B3 locus. The s-type genes were tightly linked and
formed specific haplotypes. (C) The composition of eight genes and their allelic
variants at the Glu-D3 locus.

Organization of LMW-GS genes in MCC accessions 

Using the LMW-GS gene molecular marker system and the 
full-length gene cloning method (Zhang et al., 2011a, b), 
almost all LMW-GS genes were detected in the MCC and 
were cloned and characterized in 30 MCC accessions, which 
facilitated investigation of the organization of LMW-GS 
genes and their linkage relationship. 

At the Glu-A3 locus, 4-6 genes were generally isolated from individual MCC
accessions, including A3-391, A3-400, and 2-4 i-type genes (e.g. A3-502a and A3-620;
A3-502g, A3-573, and A3-640; and A3-484, A3-565, A3-568, and A3-662; Figs 1a, 2a).
Although several allelic variants were identified for each gene, only 11 main types
of Glu-A3 genotypes were detected in MCC accessions, suggesting that these genes
were tightly linked (Fig. 2a). Two genotypes containing A3-391, A3-400 or A3-402a,
A3-502, and A3-620 accounted for ~70% of the MCC accessions, while each of the
remaining genotypes accounted for <7% (Fig. 2a). Moreover, concerning the
distribution of different genotypes in foreign accessions, Chinese modern varieties,
and landraces, genotypes containing the A3-649-1/-2 genes were detected only in
landraces, while those containing A3-394a/b were present in both foreign accessions
and Chinese modern varieties (Fig. 2a).

Based on the genotype data of MCC accessions, the linkage relationship among LMW-GS
genes at the Glu-A3 locus was analysed (Fig. 2a). First, four main haplotypes of
i-type genes were detected in MCC accessions, namely A3-502a/b/A3-620,
A3-502/A3-646, A3-502h/A3-649-1/A3-649-2, and A3-484/A3-565/A3-568/A3-662 (Figs 1a,
2a). The i-type genes were completely linked and formed specific haplotypes in
common wheat. Secondly, the A3-391 allele was coupled with alleles A3-400 and
A3-402a, and the A3-370a/b alleles co-segregated with A3-394b, A3-402b, A3-408, and
A3-411 (Fig. 2a). m-Type genes may have been tightly linked with each other at the
Glu-A3 locus. Thirdly, the A3-370a/b alleles were generally coupled with
A3-502(e/f)/A3-646 and A3-484/A3-565/A3-568/A3-662, while the A3-391 allele was
linked with A3-502(a/b)/A3-620, A3-502h/A3-649-1/A3-649-2, and
A3-502g/A3-573/A3-640 (Fig. 2a). Thus, the A3-391 gene generally linked with i-type
genes in MCC accessions.

At the Glu-B3 locus, although 3-5 genes were present in individual accessions, their
allelic variants consisted of 12 main genotypes in the MCC accessions (Figs 1b, 2b).
Genotypes containing allelic variants B3-621, B3-624, or B3-688 covered all MCC
accessions, excluding 11 1BL/1RS translocation lines, and genotypes containing the
haplotypes B3-688/N or B3-688/B3-691 accounted for 54% of the MCC (Fig. 2b). Tight
linkage of LMW-GS genes at the Glu-B3 locus was also observed. The B3-510 allele was
tightly coupled with the B3-570 gene, forming the haplotype B3-510/B3-570, which was
generally linked with another haplotype B3-688/B3-691 (Fig. 2b). In contrast, s-type
genes consisted of two groups of haplotypes at the Glu-B3 locus; one contained the
B3-621 gene and the other the B3-688 gene (Fig. 1b). The former group formed various
haplotypes, which contained three unique LMW-GS genes, namely B3-578, B3-544, and
B3-621, as well as their allelic variants. Alleles B3-544, B3-601, B3-604, and
B3-607 regularly co-segregated with B3-621, while the other variants B3-590, B3-593,
and B3-596 were usually coupled with B3-624 (Fig. 2b). In terms of the distribution
of these genes in the MCC, 1BL/1RS lines and genotypes containing B3-601 and B3-604
were identified only in foreign or modern varieties, whereas the genotypes
containing B3-624 and the genotype B3-530/B3-548/B3-688/B3-691 were detected mostly
in landraces (Fig. 2b).

At the Glu-D3 locus, although eight LMW-GS genes were identified from individual
accessions, few haplotypes were characterized throughout the entire MCC population
because of the high conservation of LMW-GS genes at this locus (Fig. 2c). Among
them, D3-578a was linked with the D3-432 allele, whereas D3-578b was generally
coupled with D3-441 (Fig. 2c). The D3-586 gene showed high length polymorphism among
MCC accessions, with six allelic variants (D3-583/586/589/591/594/597), which
resulted in detection of 14 main genotypes at the Glu-D3 locus in MCC accessions
(Fig. 2c). Two genotypes,
D3-385/D3-393/D3-394/D3-441/D3-525/D3-575/D3-578/D3-586 and
D3-385/D3-393/D3-394/D3-441/D3-525/D3-575/D3-578/D3-589, which differed only in the
allele of the pseudogene D3-586, together accounted for 64.8% of the MCC accessions
(Fig. 2c).

Expression of LMW-GS genes in the MCC accessions 

Since only the active genes in this family affected bread-mak- 
ing quality, the mRNA of the developing seeds was investi- 
gated (Fig. 3) and active LMW-GS genes were identified by 
comparing the mRNA and genomic DNA data (Fig. 3). 

At the Glu-A3 locus, among the 4-6 genes detected in genomic DNA (Fig. 2a), the
A3-370a/b and A3-391 alleles were not detected in mRNA of developing seeds (Fig. 3),
while their allelic variant A3-373 may have been active, given its intact ORF. In
terms of the A3-400 gene, the A3-400 and A3-402a alleles were inactive, whereas both
A3-402b and A3-408 were expressed during seed filling (Fig. 3). Also, the rare
allelic variants A3-374, A3-388, A3-394a/b, and A3-411 may encode LMW-GS proteins,
given their intact ORFs. None of the allelic variants of the A3-502 gene was
expressed due to premature stop codons, except for the active allele A3-502h. Among
the other i-type genes, A3-573, A3-620, A3-646, A3-649-2, A3-568, and A3-662 were
detected in developing seeds (Fig. 3), and the rare allelic variants A3-626a/b,
A3-643, A3-567-1, A3-567-2, and A3-590 contained intact ORFs, which might also be
active in common wheat. Thus, in a particular wheat variety, 1-3 LMW-GS genes might
be active at the Glu-A3 locus (Fig. 2a). For example, accessions with the genotype
A3-391/A3-400/A3-502/A3-620 contained only one active LMW-GS gene, A3-620, while
those with the i-type haplotype A3-484/A3-565/A3-568/A3-662 possessed three active
genes at the Glu-A3 locus (Fig. 2a).

Fig. 2. Main genotypes of LMW-GS genes at Glu-3 loci identified from the MCC.
Genotypes present in more than three MCC accessions are shown. LMW-GS genes at the
same locus were generally linked and formed limited types of genotypes. Genes in red
were active in common wheat. Moreover, the modern varieties, landraces, and foreign
varieties are indicated by different colours. The genotypes most common in landraces
are indicated by red asterisks, and those found only in modern and foreign varieties
by blue asterisks. (A) Eleven genotypes at the Glu-A3 locus. (B) Thirteen genotypes
at the Glu-B3 locus. Of these, 11 accessions without LMW-GS genes belong to 1B/1R
translocation lines. (C) Fourteen genotypes at the Glu-D3 locus.

At the Glu-B3 locus, 3-5 genes were detected in the genomic DNA of individual
accessions (Fig. 2b). Based on the electropherogram of the LMW-GS gene marker
system, the B3-548, B3-578a/b, and B3-813 genes were not detected in developing
seeds (Fig. 3), while the remaining m-type genes (B3-530 and the newly identified
B3-570) and s-type genes (B3-544, B3-621, and B3-688) were generally expressed in
wheat varieties, although B3-570 was present in only some MCC accessions (Figs 2b,
3). With regard to active genes of different haplotypes, the number of active Glu-B3
genes varied from two to four in one wheat variety, including one or two
m-type-encoding genes and one or two s-type-encoding genes (Fig. 2b).
Fig. 3. Expression analysis of LMW-GS genes in MCC accessions. Electropherograms show the patterns of DNA fragments detected
concentration of DNA fragments in the PCR products). The orange peaks indicate the GeneScan 1200 LIZ size standard fragments, while
the blue peaks represent the DNA fragments in the PCR products. In the genomic DNA electropherogram, peaks indicated by arrows
and numbers were detected only in genomic DNA, all of which were pseudogenes. Peaks in the cDNA electropherograms indicated by
numbers correspond to active LMW-GS genes. Glu-A3 genes are shown in black, Glu-B3 genes in green, and Glu-D3 genes in pink.



At the Glu-D3 locus, of the eight LMW-GS genes detected in genomic DNA, D3-385,
D3-394, D3-441, D3-525, D3-575, and D3-578 were detected in developing seeds of MCC
accessions (Fig. 3), and the rare allelic variants D3-397, D3-444, and D3-522 might
be active, given their intact ORFs. The remaining two genes, D3-393 and D3-586, were
not identified in the developing seeds, probably due to premature stop codons
(Fig. 3). Thus, one s-type (D3-578) and five m-type genes (D3-385, D3-394, D3-441,
D3-525, and D3-575) were generally expressed at the Glu-D3 locus in individual wheat
varieties (Fig. 2c). These expression analyses revealed that individual accessions
with different allelic variants or haplotypes might possess different numbers of
active genes. Generally, zero or one m-type and one or two i-type active genes were
detected at the Glu-A3 locus, and one or two m-type and one or two s-type genes at
the Glu-B3 locus were expressed, whereas one s-type and five m-type genes comprised
the active genes at the Glu-D3 locus. Thus, the number of active genes in individual
MCC accessions varied from nine to 13.
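
The overall range of nine to 13 active genes follows from adding the per-locus minima and maxima given above. The short sketch below is a consistency check only; the data structure and function name are hypothetical, and it assumes that the per-locus extremes can occur together in a single accession.

```python
# (min, max) number of active genes per type at each locus, from the summary in the text.
ACTIVE_RANGES = {
    "Glu-A3": {"m": (0, 1), "i": (1, 2)},
    "Glu-B3": {"m": (1, 2), "s": (1, 2)},
    "Glu-D3": {"m": (5, 5), "s": (1, 1)},
}


def total_active_range(ranges):
    """Sum the per-type minima and maxima across all loci."""
    low = sum(lo for types in ranges.values() for lo, _ in types.values())
    high = sum(hi for types in ranges.values() for _, hi in types.values())
    return low, high


total = total_active_range(ACTIVE_RANGES)
print(total)  # (9, 13)
```

The minimum is 1 + 2 + 6 = 9 and the maximum is 3 + 4 + 6 = 13, so the Glu-D3 locus contributes a constant six active genes and all of the variation comes from Glu-A3 and Glu-B3.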

Characteristics of LMW-GS genes identified from MCC 
accessions 

All proteins encoded by the genes identified in the present study were typical of
LMW-GS, which had similar structures to previously characterized LMW-GS (D'Ovidio
and Masci, 2004; Juhasz and Gianibelli, 2006). Each deduced protein contained four
main structural domains: a signal peptide, a short conserved N-terminal domain, a
repetitive domain, and a C-terminal domain, except for i-type proteins, which lacked
the N-terminal domain (Supplementary Fig. S11 at JXB online). Based on the
N-terminal sequence of mature proteins, three types of LMW-GS (m-, s-, and i-types)
were recognized. The m-type proteins were the most abundant in all genotypes
analysed, and their molecular mass varied from 31.8 kDa (D3-385) to 39.6 kDa
(D3-575). The s-type proteins generally had a higher molecular mass than did m-type
subunits, ranging from 37.0 kDa (B3-544) to 42.5 kDa (B3-691). Also, the i-type
proteins had higher molecular weights (39.2-43.0 kDa), despite lacking the
N-terminal sequences.

Cysteine residues play a vital role in determining the structural and functional
characteristics of wheat proteins (Shewry et al., 1995; D'Ovidio and Masci, 2004).
All deduced proteins identified in this study possessed eight cysteine residues,
except the putative amino acid sequences from pseudogenes A3-502d and D3-385', which
contained seven and nine cysteine residues, respectively. However, neither
pseudogene plays a role in glutenin polymers or bread-making quality. The locations
of the first (or third for i-type genes) and seventh cysteines were highly diverse,
while the remaining six cysteines were conserved among all LMW-GS genes
(Supplementary Fig. S11 at JXB online). Based on the relative locations of
cysteines, LMW-GS proteins were divided into six groups (Supplementary Fig. S11).

All LMW-GS genes and their allelic variants were subjected 
to cluster analysis using ClustalW2 and MEGA 5. Also, the 
six main groups were further divided, which was consistent 
with the grouping data based on the cysteine positions of the 
deduced proteins (Fig. 4). The i-type genes located at the Glu-A3
locus formed a single group (i_A), and all the s-type genes
at the Glu-B3 and Glu-D3 loci were located in a single
branch (s_BD) (Fig. 4). All the remaining LMW-GS genes were
m-type, which were further divided into four groups (Fig. 4).
Variants of both the D3-441 and D3-525 genes formed single
groups (m_D-2 and m_D-1, respectively), which were unique
to the Glu-D3 locus. The m_BD group was composed of three
genes (B3-530, B3-548, and B3-570) from the Glu-B3 locus
and two (D3-575 and D3-586) from the Glu-D3 locus. In
contrast, the m_AD group contained five m-type genes, two of
which were located at the Glu-A3 locus and three at the Glu-D3
locus. Collectively, genes from the Glu-A3 locus contained
all the i-type genes (i_A) and m-type genes (m_AD), genes from
the Glu-B3 locus were associated with two groups, including
s-type (s_BD) and m-type (m_BD) genes, and genes from the
Glu-D3 locus were distributed into five of the six groups and
showed higher diversity than those at the Glu-A3 and Glu-B3
loci (Fig. 4).

Fig. 4. Phylogenetic reconstruction of all LMW-GS genes and their allelic variants identified from MCC accessions. The phylogenetic
tree of LMW-GS genes was constructed using MEGA 5 (Kumar et al., 2008). All LMW-GS genes were divided into six groups. The i-type
genes located at Glu-A3 were special and formed a single group (i_A). The s-type genes at Glu-B3 and Glu-D3 shared high identity and
were located in a single branch (s_BD). The other LMW-GS genes were of the m-type, and were divided into four groups (m_AD, m_D-1, m_BD,
and m_D-2). LMW-GS genes at the Glu-D3 locus were assigned to five groups, and showed a higher diversity than those at the Glu-A3
and Glu-B3 loci. (This figure is available in colour at JXB online.)

Of the LMW-GS genes, i-type genes were the most complex
and 12 haplotypes were detected, each of which contained
unique LMW-GS genes. Sequence alignments and
phylogenetic analysis of all i-type genes demonstrated that
the A3-502 gene was conserved (>85% diversity) among the
haplotypes, whereas the other genes could be divided into five
subgroups (Supplementary Fig. S12 at JXB online). A3-626
and A3-643 shared high identity (>98%) and formed the subgroup
i_A-1 with the A3-502 gene. The haplotypes A3-502e/f
and A3-646 were named i_A-2. The A3-502g/A3-573/A3-640
and A3-502h/A3-649-1/A3-649-2 haplotypes contained three
i-type genes and were classified into subgroups i_A-3 and i_A-4,
respectively. A3-484/A3-565/A3-568/A3-662 and
A3-487/A3-567-1/A3-567-2/A3-590 represented a unique i-type genotype
(i_A-5) in common wheat (Supplementary Fig. S12).
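
Identity thresholds such as those quoted above (for example >98%) depend on how identity is counted. The sketch below shows one common convention, assuming two sequences that have already been aligned (for example with ClustalW2) and use '-' for gaps: identical columns divided by all columns in which at least one sequence has a residue. The function name and the toy alignment are hypothetical, and the convention is an assumption rather than the method documented for this study.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical toy alignment: one substitution and one single-base indel.
    print(percent_identity("ATGCAT-A", "ATGAATGA"))
```

Other conventions (for example dividing by the shorter sequence length) give slightly different values, so thresholds are only comparable when the same definition is used throughout.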

Discussion 

LMW-GS genes in common wheat are complex, and their
exact copy number remains unclear (Cassidy et al., 1998;
Ikeda et al., 2002; Juhasz and Gianibelli, 2006; Huang and
Cloutier, 2008; Dong et al., 2010). Recently, using BAC library
screening, 14 and 19 genes were isolated from the common
wheat varieties Xiaoyan 54 and Glenlea, respectively (Huang
and Cloutier, 2008; Dong et al., 2010). Meanwhile, LMW-GS
genes at the Glu-A3, Glu-B3, and Glu-D3 loci were identified
using gene-specific primers, which suggested that at least 12
genes are present in the common wheat genome (Zhao et al.,
2006, 2007; Wang et al., 2009, 2010). Based on the conserved
and polymorphic structures of these genes, the LMW-GS
gene marker system and the full-length gene cloning method
were developed, which can identify >15 members of this gene
family in common wheat (Zhang et al., 2011a, b). In the present
study, both methods were used to investigate the MCC
of Chinese wheat germplasm, and the complex LMW-GS
gene family in common wheat was successfully dissected.

Dissection of LMW-GS genes at individual Glu-3 loci 

Glu-A3 locus

Two m-type genes and 2-4 i-type LMW-GS
genes were generally identified at the Glu-A3 locus, which was
the highest number reported for this locus in individual wheat
varieties (Figs 1a, 2a). The m-type gene, A3-391, and its allelic
variants shared high identities (>99%) with a few sequences in
GenBank derived from T. macha, T. durum, and T. timopheevii
(Supplementary Table S2 at JXB online), which suggests that
A3-391 might be widely present in Triticum. The other m-type
gene at the Glu-A3 locus, A3-400, was reported by several
groups, corresponding to GluA3-2 genes from Aroona near-isogenic
lines (NILs) (Wang et al., 2010), the group 6 type
IV gene from Norin 61 (Ikeda et al., 2002), and A3-1 from
Xiaoyan 54 (Supplementary Table S2) (Dong et al., 2010).
The present results provided direct evidence for the presence
of m-type genes at the Glu-A3 locus in common wheat, and
this gene showed high diversity among MCC accessions with
several novel allelic variants (i.e. A3-374, A3-388, A3-394a,
and A3-411; Fig. 1a). Moreover, the new allelic variants, A3-388,
A3-394a/b, A3-408, and A3-411, contained intact ORFs
and may make specific contributions to wheat bread-making
quality.

In the present study, using conserved primers, 2-4
i-type genes were identified in individual wheat varieties
(Supplementary Table S2 at JXB online). In previous
studies, 1-3 i-type genes in only a few wheat varieties
were characterized (Supplementary Table S2) (Zhang et al.,
2004; Ikeda et al., 2006; Huang and Cloutier, 2008; Dong
et al., 2010), which made it difficult to analyse the relationships
among haplotypes of these genes. Here, the MCC of
Chinese wheat germplasm were investigated in terms of
LMW-GS gene composition. The i-type genes were present
in the wheat genome as haplotypes rather than single genes,
and 12 haplotypes of i-type genes were detected in the MCC.
Nucleotide sequence comparisons showed that genes in six
of the 12 haplotypes identified in this study were similar (>99%)
to those isolated from seven Glu-A3 alleles, for example haplotype
A3-502d/643 corresponding to GluA3-32/Glu-A3-12
from Glu-A3b (Supplementary Table S2) (Wang et al., 2010;
Zhang et al., 2012). In addition, haplotype
A3-484/A3-565/A3-568/A3-662 contained the three i-type genes identified
in Norin 61 and Xiaoyan 54 (A3-2, A3-3, and A3-4), and
the haplotype A3-502f/646b covered the i-type gene detected
in Glenlea (EU189087) (Huang and Cloutier, 2008; Dong
et al., 2010). This confirms that i-type genes in common
wheat exist as haplotypes at the Glu-A3 locus and exhibit
high genetic diversity. The identification and characterization
of these haplotypes will facilitate the functional analysis
of i-type genes and the selection of specific genes using
haplotype-specific markers.

Glu-B3 locus

Three to five Glu-B3 genes were detected in
individual varieties, of which 1-3 were s-type and two or
three were m-type (Figs 1b, 2b). The m-type gene B3-530
shared >99% sequence identity with GluB3-4 genes from
Aroona and its near-isogenic lines, the B3-1 gene from
Xiaoyan 54 and Jing 411, 75 57N24-M from Glenlea, and the
group 2 type I gene from Norin 61 (Supplementary Table S2
at JXB online) (Ikeda et al., 2002; Huang and Cloutier, 2008;
Wang et al., 2009; Dong et al., 2010), while B3-548 has been
reported only rarely since it is a pseudogene. The third m-type
gene, B3-570, was newly identified from wheat varieties and
was detected in some of the MCC accessions containing B3-510
(Fig. 2b). Thus, at least two m-type genes were present at
the Glu-B3 locus, rather than the one reported previously
(Ikeda et al., 2002; Huang and Cloutier, 2008; Dong et al.,
2010). The other genes at the Glu-B3 locus were of s-type,
and were divided into two subgroups based on their gene
composition: one containing the B3-688 gene and the other
containing B3-621 (Figs 1b, 2b). The B3-688 subgroup of
s-type haplotypes corresponded to B3-2 from Xiaoyan 54
as well as Jing 411 and GluB3-3 from Aroona-Glu-B3c, B3d,
B3h, and B3i (Supplementary Table S2) (Wang et al., 2009;
Dong et al., 2010; Zhang et al., 2012). The other subgroup
contained two active genes, B3-544 and B3-621, and one
pseudogene, B3-578. B3-544 had 99% sequence identity
with GluB3-1 from Aroona-Glu-B3a, B3b, B3f, and B3g.
Also, B3-621 genes were present in Aroona-Glu-B3a, B3b,
B3f, and B3g, corresponding to GluB3-2 (Wang et al., 2009;
Zhang et al., 2012) (Supplementary Table S2). Thus, s-type
genes existed as haplotypes in common wheat. However,
the two subgroups of s-type genes displayed significant
differences in terms of gene composition and sequences, and
thus might make different contributions to dough quality.
Overall, evaluation of Glu-B3 genes/haplotypes will enable
development of haplotype-specific primers for marker-assisted
selection.

Glu-D3 locus

One s-type and seven m-type genes were
identified at the Glu-D3 locus from a single wheat genotype,
which was by far the highest number of LMW-GS genes
reported for this locus (Fig. 2c). Pseudogene D3-586 was
newly detected in common wheat, whereas the other seven
genes have been investigated extensively (Supplementary
Table S2 at JXB online), and covered all Glu-D3 genes
identified in wheat varieties, including Norin 61, Glenlea,
Xiaoyan 54, Jing 411, and Aroona NILs (Supplementary
Table S2) (Ikeda et al., 2002; Zhao et al., 2006, 2007; Huang
and Cloutier, 2008; Dong et al., 2010; Zhang et al., 2011a,
b). These Glu-D3 genes were highly conserved, with only a
few allelic variants (>99% identities), of which three novel
active allelic variants (D3-397, D3-444, and D3-522) were
detected with unique SNPs or indels in MCC, but functional
analysis of these variants has been limited. Moreover, these
Glu-D3 genes shared >97% identities with LMW-GS genes
isolated from Aegilops tauschii (Johal et al., 2004; Dong
et al., 2010), which further confirmed the conservation of
Glu-D3 genes.

Relative genetic locations of LMW-GS genes at the 
Glu-3 loci 

Typical LMW-GS genes were located at the Glu-A3, Glu-B3,
and Glu-D3 loci on the homoeologous group 1 chromosomes.
However, little is known about the relative location
of LMW-GS genes at individual loci due to the complexity
of gene composition and the lack of appropriate methods
of investigating this gene family. Recently, the recombination
of 14 LMW-GS genes at Glu-3 loci was analysed and the
relative genetic position of these genes was determined (Dong
et al., 2010). Subsequently, four more genes were detected and
sequenced in Xiaoyan 54 (Zhang et al., 2011a, b). In the present
study, based on the allelic relationship with the genes in
Xiaoyan 54, all the LMW-GS genes in the MCC were assigned
to specific positions in homoeologous group 1 chromosomes
(Fig. 5). At the Glu-A3 locus, two groups of LMW-GS genes,
m_AD and i_A gene clusters, were found and little recombination
was detected within groups, of which the i_A group was distal
and the m_AD group was proximal to the centromere (Figs 2, 5).
At the Glu-B3 locus, although the relative position of B3-548
was not determined, the m- and s-type genes might exist as
two gene clusters (m_BD and s_BD). Also, m-type genes were
more proximal to the centromere than s-type genes (Figs 2, 5).
At the Glu-D3 locus, tight linkage was identified only between
D3-441 and D3-578a or D3-432 and D3-578b (Fig. 5), which
could be explained by the close physical proximity (15.9 kb) of
these genes (Dong et al., 2010). Additionally, D3-385, D3-393,
D3-394, and D3-525 genes were located in close proximity
and had a high identity (Figs 4, 5). Based on the location and
sequence analysis, LMW-GS genes that show high identity or
belong to the same group may be tightly linked and located at
the same position in the Glu-3 loci.

Fig. 5. Organization of the LMW-GS genes in homoeologous
group 1 chromosomes. The relative locations of genes or
haplotypes were determined based on Dong et al. (2010).
The main allelic variants are displayed as representatives. The
distances among genes do not represent the genetic or physical
distances. (A) Relative genetic positions of LMW-GS genes at the
Glu-A3 locus. The i-type genes were tightly linked at the Glu-A3
locus and formed five haplotype subgroups, and both m-type
genes also generally co-segregated in the MCC. (B) Relative
genetic positions of LMW-GS genes at the Glu-B3 locus. The
s-type genes were coupled at the Glu-B3 locus and were of two
principal haplotypes. (C) Relative genetic positions of the eight
LMW-GS genes at the Glu-D3 locus. Of these, only D3-441 and
D3-578 were tightly linked.

In addition, the 12 i-type haplotypes at the Glu-A3 locus
could be divided into five subgroups (i_A-1 to i_A-5; Fig. 5;
Supplementary Fig. S12 at JXB online), and s-type haplotypes
at the Glu-B3 locus were divided into two subgroups, revealing
the high diversity of the i- and s-type genes/haplotypes among
wheat varieties. These subgroups were significantly different in
terms of gene numbers and sequences, which suggests that the
Glu-A3 and Glu-B3 loci in common wheat might be derived
from several unique ancestors or have been involved in significant
mutational or recombination events during the course of
their evolution (Figs 1b, 5).

The complex LMW-GS gene family in Chinese wheat 
germplasm 

The LMW-GS gene family was investigated using the MCC 
of Chinese wheat germplasm, which consists of 262 accessions
representing an estimated 70% of the genetic diversity of
the full collection (Hao et al., 2011). Using the MCC, most
(>15) LMW-GS genes were identified in individual wheat 
varieties. This allowed investigation of the classification and 
relationship of these genes in Chinese wheat germplasm. 

The i-type genes were reported only at the Glu-A3 locus
(Zhang et al., 2004; Ikeda et al., 2006; Huang and Cloutier,
2008; Dong et al., 2010; Wang et al., 2010; Zhang et al., 2011a,
b) or the A genome, for example T. urartu and T. monococcum
(An et al., 2006; Ma et al., 2006; Caballero et al., 2008;
Long et al., 2008). The findings also indicate that all i-type
genes detected in MCC accessions were located at the Glu-A3
locus (Fig. 4). Since i-type genes lacked the sequences encoding
the N-terminal domain, m- and s-type genes without the
N-terminal domain-coding sequences were used for the phylogenetic
analysis. It was found that i-type genes had a closer
relationship with s-type genes (s_BD) than m-type genes, excluding
the m_D-2 group (Supplementary Fig. S13 at JXB online).
These results support the suggestion that the i-type genes may be the result
of a deletion event of s-type genes (Gao et al., 2007), and that the
i-type genes comprise a relatively young group of LMW-GS
genes (Juhasz and Gianibelli, 2006). The s-type genes were
distributed at the Glu-B3 and Glu-D3 loci in common wheat
and in the progenitor of the wheat A genome, T. urartu (data not
shown), but not at the Glu-A3 locus in common wheat (Fig. 4).
This suggests that their disappearance from the Glu-A3 locus
might be the result of elimination at the polyploid level. The
m-type genes were common at the Glu-3 loci, and the difference
between the s- and m-types was not significant (D'Ovidio
and Masci, 2004). The D3-441 gene (m_D-2) and s-type genes
were located at the same main branch, although they belonged
to different groups (Fig. 4). These data support the view that the s-type
genes probably originated from m-type genes due to mutation
of MET to MEN in the N-terminal region (Masci et al., 1998;
D'Ovidio and Masci, 2004). This also suggests that the m-type
genes might be the oldest type of LMW-GS gene.



Genome sequence analysis of Glu-3 loci revealed that both
i-type and s-type genes/haplotypes existed together with
the Pm3 analogue and genetic marker SFR159, while most
m-type genes, m_AD, m_D-1, and m_BD, were tightly linked with
another genetic marker, WHS179 (Wicker et al., 2003; Gao
et al., 2007; Dong et al., 2010). Moreover, at the Glu-A3 and
Glu-B3 loci, i-type and s-type genes were distal, while the m_AD
and m_BD groups of genes were proximal to the centromere
(Fig. 5) (Dong et al., 2010). At the Glu-D3 locus, the s-type
gene D3-578 was also more distal to the centromere than the
m_BD gene, D3-575 (Fig. 5) (Dong et al., 2010). Thus, the phylogenetic
analysis (Fig. 4; Supplementary Fig. S13 at JXB
online), together with the linked genes/markers and their relative
locations on chromosomes, suggests that i-type and s-type
genes may have been derived from similar ancestral genes,
and the m_AD (only Glu-A3 genes) and m_BD groups of genes
were orthologues among the A, B, and D subgenomes.

After sequence alignment and clustering analysis, the
LMW-GS genes detected in MCC accessions were divided
into six groups (Fig. 4). The genes at the Glu-A3 locus were
assigned to the i_A and m_AD groups, and those at the Glu-B3
locus were divided into the s_BD and m_BD groups, whereas
those at the Glu-D3 locus were distributed widely among the
m_D-1, m_D-2, s_BD, m_BD, and m_AD groups (Fig. 4). Thus, the Glu-D3
genes showed higher diversity than those at the Glu-A3 or
Glu-B3 loci in individual wheat varieties. The Glu-A3 locus
did not share the same group of LMW-GS genes as the Glu-B3
locus (Figs 4, 5), which suggested that genes at the Glu-A3
and Glu-B3 loci evolved through different routes and had
a distant evolutionary relationship. All genes at the Glu-B3
locus and three Glu-D3 genes comprised two groups (s_BD and
m_BD); these homoeoalleles showed close relationships between
the Glu-B3 and Glu-D3 loci, which was consistent with the
derivation of BB and DD subgenomes from Aegilops.
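
The group-sharing argument in this paragraph can be checked with simple set operations. The sketch below encodes only the locus-to-group assignments stated above (group labels written as i_A, m_AD, s_BD, m_BD, m_D-1, and m_D-2); the dictionary and function names are hypothetical and serve purely as an illustration.

```python
from itertools import combinations

# Phylogenetic groups represented at each Glu-3 locus, as stated in the text (Fig. 4).
LOCUS_GROUPS = {
    "Glu-A3": {"i_A", "m_AD"},
    "Glu-B3": {"s_BD", "m_BD"},
    "Glu-D3": {"m_D-1", "m_D-2", "s_BD", "m_BD", "m_AD"},
}


def shared_groups(locus_a, locus_b):
    """Return the groups that contain genes from both loci."""
    return LOCUS_GROUPS[locus_a] & LOCUS_GROUPS[locus_b]


if __name__ == "__main__":
    for locus_a, locus_b in combinations(sorted(LOCUS_GROUPS), 2):
        print(locus_a, locus_b, sorted(shared_groups(locus_a, locus_b)))
```

Glu-A3 and Glu-B3 share no group, whereas both groups found at Glu-B3 also occur at Glu-D3, which is the pattern the text interprets as a closer relationship between the B and D loci.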

Analysis of the LMW-GS gene family was performed using
MCC accessions, consisting of foreign varieties, Chinese
modern varieties, and landraces. The novel i-type haplotype,
A3-502h/A3-649-1/A3-649-2, was detected only in landraces
and was absent from Chinese modern or foreign wheat varieties
(Fig. 2a). This also occurred for genotypes containing B3-624
and the genotype B3-530/B3-548/B3-688/B3-691 (Fig. 2b).
Although these haplotypes possessed an equal number of
active genes to other haplotypes, they have not been selected
for use in modern wheat breeding programmes. This may be
because these genes or those linked to them have a detrimental
effect on bread-making quality or yield potential. In contrast,
A3-394a/b, B3-601/604, and the genotypes
D3-385/D3-376/-/D3-432/-/D3-575/D3-578a/-, and 1BL/1RS lines were present
only in modern or foreign varieties. The presence of these
genes/genotypes in Chinese modern varieties might be the
result of incorporation of foreign germplasm in breeding programmes
in the past several decades. For example, the elite
1BL/1RS translocation lines were introduced into China in
the 1970s and were exploited in modern wheat breeding since
they contain several disease resistance and yield improvement
genes. Also, the other genes/genotypes were probably introduced
into Chinese wheat germplasm since they (or linked
genes) increased the yield potential or bread-making quality.

Using the LMW-GS gene marker system and full-length
gene cloning method, a representative population (MCC) was 
investigated, and the composition, organization, variation, and 
expression of LMW-GS genes were evaluated. Furthermore, 
the LMW-GS genes corresponding to all DNA fragments from 
the LMW-GS gene marker system were identified (Table 1; 
Supplementary S1 at JXB online). The expression profile of these
genes was revealed by comparing the genomic DNA and cDNA 
data. These data will facilitate the update of the LMW-GS gene 
marker system which can be used to separate, identify, and char- 
acterize LMW-GS genes efficiently in common wheat. 

Supplementary data 

Supplementary data are available at JXB online. 

Figure S1. Sequence alignments of the A3-391 gene identified
in the MCC.

Figure S2. Sequence alignments of the A3-400 gene identi- 
fied in the MCC. 

Figure S3. Sequence alignments of the A3-502 gene identi- 
fied in the MCC. 

Figure S4. Sequence alignments of the B3-530 gene identi- 
fied in the MCC. 

Figure S5. Sequence alignments of the B3-544 gene identi- 
fied in the MCC. 

Figure S6. Sequence alignments of the B3-621 and B3-688 
genes identified in the MCC. 

Figure S7. Sequence alignments of the D3-441 gene identi- 
fied in the MCC. 

Figure S8. Sequence alignments of the D3-525 gene identi- 
fied in the MCC. 

Figure S9. Sequence alignments of the D3-578 gene identi- 
fied in the MCC. 

Figure S10. Sequence alignments of the D3-586 gene iden- 
tified in the MCC. 

Figure S11. Sequence alignments of the deduced proteins
of 15 representative LMW-GS genes from MCC accessions.

Figure S12. Phylogenetic reconstruction of all i-type 
LMW-GS genes and their allelic variants identified from 
MCC accessions. 

Figure S13. Phylogenetic reconstruction of all LMW-GS
genes and their allelic variants with removed sequences coding
for N-terminal domains.

Table S1. Gene-specific primers used for cloning rare allelic
variants.

Table S2. Nucleotide sequence identities of LMW-GS
genes from MCC to the previously reported Glu-A3, B3, and
D3 alleles/genes.